Geometric standard deviation

In probability theory and statistics, the geometric standard deviation describes how spread out a set of numbers is when the preferred average is the geometric mean. For such data, it may be preferred to the more usual standard deviation.

Definition

If the geometric mean of a set of numbers  \{ A_1, A_2, \dots , A_n \}  is denoted as  \mu_g , then the geometric standard deviation is

 \sigma_g = \exp \left( \sqrt{ \sum_{i=1}^n ( \ln { A_i \over \mu_g } )^2 \over n } \right). \qquad \qquad (1)
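
As an illustration, equation (1) can be evaluated directly. The following is a minimal sketch rather than a reference implementation; the function names geometric_mean and geometric_sd and the sample data are chosen here for illustration only. It assumes every value is strictly positive and divides by n, as in equation (1).

```python
import math

def geometric_mean(values):
    """Geometric mean, computed as exp of the arithmetic mean of the logs."""
    logs = [math.log(v) for v in values]  # math.log raises ValueError if v <= 0
    return math.exp(sum(logs) / len(logs))

def geometric_sd(values):
    """Geometric standard deviation as in equation (1), dividing by n."""
    mu_g = geometric_mean(values)
    squared = [math.log(v / mu_g) ** 2 for v in values]
    return math.exp(math.sqrt(sum(squared) / len(values)))

# Illustrative data only: three values spaced by a constant factor of 10.
data = [1.0, 10.0, 100.0]
print(geometric_mean(data))  # approximately 10
print(geometric_sd(data))    # approximately 6.55
```

A result of exactly 1 means that all values are identical, because the geometric standard deviation is a multiplicative factor rather than an additive distance.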

Derivation

If the geometric mean is

 \mu_g = \sqrt[n]{ A_1 A_2 \cdots A_n },

then taking the natural logarithm of both sides results in

 \ln \mu_g = {1 \over n} \ln (A_1 A_2 \cdots A_n).

The logarithm of a product is a sum of logarithms (assuming A_i is positive for all i), so

 \ln \mu_g = {1 \over n} [ \ln A_1 + \ln A_2 + \cdots + \ln A_n ].\,

It can now be seen that  \ln \, \mu_g  is the arithmetic mean of the set  \{ \ln A_1, \ln A_2, \dots , \ln A_n \} . The arithmetic standard deviation of this same set, which defines  \ln \sigma_g , is therefore

 \ln \sigma_g = \sqrt{ \sum_{i=1}^n ( \ln A_i - \ln \mu_g )^2 \over n }.

Combining the two logarithms into the logarithm of a quotient and exponentiating both sides gives

 \sigma_g = \exp{\sqrt{ \sum_{i=1}^n (  \ln { A_i \over \mu_g } )^2 \over n }}.

Geometric standard score

The geometric version of the standard score is

 z = {\ln ( x/\mu_g ) \over \ln \sigma_g }.\,

If the geometric mean, the geometric standard deviation, and the z-score of a datum are known, then the raw score can be reconstructed by

 x = \mu_g \sigma_g^z.
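
The two formulas above are inverses of each other, which a short round trip can illustrate. This is a sketch under stated assumptions: the function names geometric_z and from_geometric_z and the values of mu_g and sigma_g are hypothetical, and sigma_g must differ from 1, since otherwise ln σg is zero and the score is undefined.

```python
import math

def geometric_z(x, mu_g, sigma_g):
    """Geometric standard score: ln(x / mu_g) / ln(sigma_g)."""
    return math.log(x / mu_g) / math.log(sigma_g)

def from_geometric_z(z, mu_g, sigma_g):
    """Reconstruct the raw score: mu_g * sigma_g ** z."""
    return mu_g * sigma_g ** z

# Hypothetical parameters chosen for illustration.
mu_g, sigma_g = 10.0, 2.0

z = geometric_z(40.0, mu_g, sigma_g)        # ln(4) / ln(2), approximately 2
x = from_geometric_z(z, mu_g, sigma_g)      # approximately 40 again
print(z, x)
```

A score of z = 2 here means that the datum lies two multiplicative steps of size σg above the geometric mean.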

Relationship to log-normal distribution

The geometric standard deviation is related to the log-normal distribution, which is the distribution of a variable whose logarithm is normally distributed. Taking logarithms of the data shows that the geometric standard deviation is the exponentiated value of the standard deviation of the log-transformed values, i.e. exp(stdev(ln(A))).

As such, the geometric mean and the geometric standard deviation of a sample of data from a log-normally distributed population may be used to find the bounds of confidence intervals analogously to the way the arithmetic mean and standard deviation are used to bound confidence intervals for a normal distribution. See discussion in log-normal distribution for details.
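
One way to read this analogy: where a normal distribution uses bounds of the form mean ± z·(standard deviation), the log-normal case uses the geometric mean divided and multiplied by σg raised to the power z. The sketch below is illustrative only. The function name lognormal_interval and the parameter values are hypothetical, and it treats the given geometric mean and geometric standard deviation as the true population parameters, so it describes the range expected to contain a given fraction of individual values rather than an interval for an estimated mean.

```python
from statistics import NormalDist

def lognormal_interval(mu_g, sigma_g, coverage=0.95):
    """Multiplicative bounds (mu_g / sigma_g**z, mu_g * sigma_g**z).

    z is the standard normal quantile for the requested central coverage,
    for example about 1.96 for coverage = 0.95.
    """
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    factor = sigma_g ** z
    return mu_g / factor, mu_g * factor

# Hypothetical parameters chosen for illustration.
low, high = lognormal_interval(10.0, 2.0, coverage=0.95)
print(low, high)
```

The bounds are symmetric about the geometric mean on a logarithmic scale, so their product equals the square of the geometric mean.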